69 research outputs found
Graph Based Semi-supervised Learning with Convolution Neural Networks to Classify Crisis Related Tweets
During time-critical situations such as natural disasters, rapid
classification of data posted on social networks by affected people is useful
for humanitarian organizations to gain situational awareness and to plan
response efforts. However, the scarcity of labeled data in the early hours of a
crisis hinders machine learning tasks and thus delays crisis response. In this
work, we propose to use an inductive semi-supervised technique to utilize
unlabeled data, which is often abundant at the onset of a crisis event, along
with fewer labeled data. Specifically, we adopt a graph-based deep learning
framework to learn an inductive semi-supervised model. We use two real-world
crisis datasets from Twitter to evaluate the proposed approach. Our results
show significant improvements using unlabeled data as compared to only using
labeled data. Comment: 5 pages. arXiv admin note: substantial text overlap with
arXiv:1805.0515
Research report on Bengali NLP engine for TTS
Includes bibliographical references (page 5). This report describes the Bengali NLP processor for TTS, along with the challenges faced in developing it. Firoj Ala
CrisisMMD: Multimodal Twitter Datasets from Natural Disasters
During natural and man-made disasters, people use social media platforms such
as Twitter to post textual and multimedia content to report updates about
injured or dead people, infrastructure damage, and missing or found people
among other information types. Studies have revealed that this online
information, if processed in a timely and effective manner, is extremely useful for
humanitarian organizations to gain situational awareness and plan relief
operations. In addition to the analysis of textual content, recent studies have
shown that imagery content on social media can boost disaster response
significantly. Despite extensive research that mainly focuses on textual
content to extract useful information, limited work has focused on the use of
imagery content or the combination of both content types. One of the reasons is
the lack of labeled imagery data in this domain. Therefore, in this paper, we
aim to tackle this limitation by releasing a large multi-modal dataset
collected from Twitter during different natural disasters. We provide three
types of annotations, which are useful to address a number of crisis response
and management tasks for different humanitarian organizations. Comment: 9 pages
Text to speech for Bangla language using festival
Includes bibliographical references (pages 6-7). In this paper, we present a Text to Speech (TTS) synthesis system for the Bangla language using the open-source Festival TTS engine. Festival is a complete TTS synthesis system, with components supporting front-end processing of the input text, language modeling, and speech synthesis using its signal processing module. The Bangla TTS system proposed here creates the voice data for Festival and additionally extends Festival, using its embedded Scheme scripting interface, to incorporate Bangla language support. Festival is a concatenative TTS system using diphone or other unit selection speech units. Our TTS implementation uses two different kinds of these concatenative methods supported in Festival: unit selection and multisyn unit selection. The function of a Text-to-Speech system is to convert text in some language into its spoken equivalent through a series of modules. These modules, which constitute the TTS system, are described in detail to support future development. Finally, the quality of the synthesized speech is assessed in terms of acceptability and intelligibility.